(54) A method and system for suggesting related documents

(57) The document reading system passively analyzes a document to generate margin or end notes of references to other documents that relate to annotated passages in the document or to the entire document. The invention is responsive to the annotation of a document to passively generate a query that retrieves documents that have similar content to the annotated passage. The retrieved documents are available to the reader through selectable links placed in the margin near the annotation. Additionally, the invention provides end notes with links to documents that are similar in content to the overall content of the annotated document. The invention assists the reader by passively generating selectable links to related documents to assist the user in relating the new document to previously read material.
Description

[0001] This invention relates generally to electronic document reading systems. In particular, this invention is directed to an electronic document reading system that suggests other related documents when displaying a first document.

[0002] Retrieving documents similar to a document identified by the user as being related is known as relevance feedback. Relevance feedback is described in "Introduction to Modern Information Retrieval", G. Salton et al., McGraw Hill (1983), incorporated herein by reference in its entirety. Interfaces that support relevance feedback conventionally require explicit action on the part of the reader and do not spontaneously offer suggestions of relevant documents. Information exploration interfaces designed for window-based computing environments typically present search results for other relevant documents via lists in a separate window or by replacing the visible document with the search results. These systems are very intrusive and interrupt the reading process.

[0003] Hypertext interfaces display links to documents relevant to a source document either by providing a margin that contains the links or by embedding the links in the text of the source document in the manner pioneered by "Hyperties." This system is described in "User Interface Design for the Hyperties Electronic Encyclopedia", by Shneiderman, Proceedings of Hypertext '87, November 1987, Chapel Hill, NC, incorporated herein by reference in its entirety. However, these links are static and are created along with the source document by the hypertext author. Some systems, such as Trellis, display links dynamically, but only from a fixed set of previously-defined links. Trellis is described in "Programmable Browsing Semantics and Trellis", by R. Furuta et al., Proceedings of Hypertext '89, November 1989, Pittsburgh, PA, ACM Press, incorporated herein by reference in its entirety.

[0004] The HieNet system uses inter-node similarity measures to create hypertext links based on links previously created by the hypertext author. This system is described in "HieNet: A User-Centered Approach for Automatic Link Generation", D.T. Chang, Proceedings of Hypertext '93, November 1993, Seattle, WA, ACM Press, incorporated herein by reference in its entirety. When the author creates a link from a document A to a document B, the system automatically adds links from all documents similar to document A to all documents similar to document B. Anchors for these automatically-generated links are represented by icons in the margin of the various documents. Clicking on an icon displays a pop-up menu that contains a list of possible destination documents that are ranked by relevance to the query. Again, this system relies on links previously created by the author.
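
The link-propagation rule described above can be pictured with the following minimal sketch. It is an illustrative reading of the paragraph only, not HieNet's actual implementation: the `similar` mapping, the function name `propagate_link` and the document identifiers are assumptions introduced for the example, and how similarity is measured is left out.

```python
from itertools import product

def propagate_link(source, target, similar):
    """Return the author's link plus the automatically generated links.

    `similar` maps a document id to the set of ids judged similar to it
    (hypothetical input; the similarity measure itself is not modelled).
    """
    sources = {source} | similar.get(source, set())
    targets = {target} | similar.get(target, set())
    # Every document similar to the source links to every document
    # similar to the target; self-links are dropped.
    return {(s, t) for s, t in product(sources, targets) if s != t}

similar = {"A": {"A1", "A2"}, "B": {"B1"}}
links = propagate_link("A", "B", similar)
```

Note that every generated link still derives from one author-created link, which is the limitation the paragraph points out.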

[0005] Other conventional systems relate to hypertext-like ways of displaying search results. HieNet displays automatic links in the margin, but anchors in the margin are not relevant to the content of the passage adjacent to the anchor. HieNet does not distinguish between document-document and passage-document links. Furthermore, HieNet does not indicate the number and nature of the documents reachable through the margin links.

[0006] Visualization of Information Retrieval System (hereinafter VOIR) is described in "Queries? Links? Is There a Difference?", Proceedings of CHI '97, G. Golovchinsky, March 1997, Atlanta, GA, ACM Press and in "What the Query Told the Link: The Integration of Hypertext and Information Retrieval", Proceedings of Hypertext '97, G. Golovchinsky, April 1997, Southampton, UK, ACM Press, each incorporated herein by reference in its entirety. VOIR is a mechanism that dynamically creates and resolves hypertext links with queries that are computed from the text surrounding a selected anchor. VOIR uses queries to retrieve sets of documents that are related to the passage containing the selected anchor. VOIR does not show the user links that have pre-established relationships. Rather, to submit a query and to establish a relationship, the user has to pause and select an anchor. VOIR was designed specifically to support interactive information exploration, rather than to facilitate the reading process. Thus, VOIR's focus is supporting navigation between documents. The user is thus expected to devote much cognitive effort to browsing. Furthermore, VOIR does not permit the user to annotate or tag documents. VOIR also does not indicate which link was selected to generate a particular display.

[0007] A background information retrieval process called the Remembrance Agent (hereinafter RA) is described in "A Continuously Running Automated Information Retrieval System", B.J. Rhodes et al., Proceedings of The First International Conference on the Practical Application of Intelligent Agents in Multi-Agent Technology, PAAM '96, April 1997, London, UK, incorporated herein by reference in its entirety. RA operates in an EMACS text window and suggests documents related to the last few lines of text typed by the user. RA is designed to search through a user's private data to suggest documents related to the text being typed. However, these suggestions are ephemeral and relate only to text that is currently being written. RA does not support reading tasks because it continuously replaces suggestions as the user edits the document.

[0008] QRL is a query-based information exploration interface that uses ink-like marks on text to specify boolean queries. This system is described in "Queries-R-Links: Graphical Markup for Text Navigation", by G. Golovchinsky et al., Proceedings of INTERCHI '93, April 1993, Amsterdam, The Netherlands, ACM Press, incorporated herein by reference in its entirety. Query terms are selected with rectangles. Lines connect the rectangles to represent boolean AND operators.

[0009] All of these systems require extensive user interaction to generate links to related documents or only support writing. An electronic document reading system is needed that passively and unobtrusively generates links to related documents to support reading.

[0010] This invention provides a method and a system for passively showing the reader related documents without interfering with the reading process.

[0011] The invention further provides intuitive support for reading by automatically detecting documents potentially of interest to the reader based on the reader's interaction with the source document being read. When people read text, they often make annotations to highlight interesting or controversial passages and terms. The presence or relative density of such marks and scribbles may be used as an indicator of the relative interest that the reader has in a particular passage. When a large body of documents related to the document being read is available, the reader may be interested in finding related documents as part of the reading process.
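
As a rough illustration of using the density of marks as an interest indicator, the sketch below sums a weight per mark for each passage. The description does not prescribe any formula; the mark types, the weights in `MARK_WEIGHTS` and the function name `passage_interest` are assumptions made purely for this example.

```python
from collections import defaultdict

# Hypothetical weights: here a circled term is taken to signal more
# interest than a margin bar. The description gives no such numbers.
MARK_WEIGHTS = {"margin_bar": 1.0, "underline": 2.0, "circle": 3.0}

def passage_interest(annotations):
    """Sum mark weights per passage.

    `annotations` is an iterable of (passage_id, mark_type) pairs.
    """
    scores = defaultdict(float)
    for passage_id, mark_type in annotations:
        scores[passage_id] += MARK_WEIGHTS.get(mark_type, 1.0)
    return dict(scores)

marks = [(0, "margin_bar"), (2, "underline"), (2, "circle"), (2, "circle")]
scores = passage_interest(marks)
most_interesting = max(scores, key=scores.get)
```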

[0012] References to documents related to specific passages of interest to the user are placed in the source document's margins and references to documents similar overall to the source document are inserted as end notes. The system and method of this invention maintain the links once they have been identified to facilitate non-linear reading and skimming.

[0013] A user's interests are inferred from annotations made while reading the source document. Therefore, the system and method of this invention minimize cognitive overhead in two ways: 1) no expressive query is required to identify documents related to the source document; and 2) selectable links to the related documents are provided unobtrusively in the margins and at the end of the document, as shown in Figs. 2 and 3, respectively.

[0014] The system also introduces suggestions to the reader in a manner compatible with other interactions, rather than burdening the user with modal dialogues. Suggested documents are accessible by following the selectable links. However, the user does not have to act on a suggestion when it is made. Rather, the user can act on the suggestion when (or if) it makes sense to do so. The system and method of this invention represent the type of the referenced document with an icon and provide a textual label to the icon to give users a better understanding of the target of the link.

[0015] These and other features and advantages of this invention are described in or apparent from the following detailed description of the preferred embodiments.

[0016] The preferred embodiments of this invention will be described in detail, with reference to the following figures, wherein:

Fig. 1 is a block diagram of one embodiment of the electronic document reading system of this invention;

Fig. 2 shows a source document having an icon in the margin adjacent to an annotated passage;

Fig. 3 shows another source document having an endnote; and

Fig. 4 is a flowchart outlining a control routine for one embodiment of this invention.

[0017] Fig. 1 shows a block diagram of one embodiment of a document reading system 10 according to this invention. The document reading system 10 includes a processor 12 communicating with a first memory 14 that stores a source document 16 that is currently being read by a user on a display 18. The processor 12 also communicates with a second memory 20 that stores potentially related target documents 22. A user interacts with and controls the document reading system 10 through any number of conventional input/output devices 24, such as a mouse 26, a keyboard 28, or a pen-based interface 30. The input/output devices 24 communicate with an input/output interface 31 that, in turn, communicates with the processor 12.

[0018] As shown in Fig. 1, the system 10 is preferably implemented on a programmed general purpose computer. However, the system 10 can also be implemented using a special purpose computer, a programmed microprocessor or microcontroller and any necessary peripheral integrated circuit elements, an ASIC or other integrated circuit, a hardwired electronic or logic circuit such as a discrete element circuit, a programmable logic device such as a PLD, PLA, FPGA or PAL, or the like. In general, any device capable of implementing a finite state machine that can, in turn, implement the flowchart shown in Fig. 4 can be used to implement the system 10.

[0019] Additionally, as shown in Fig. 1, the storage devices or memories 14 and 20 are preferably implemented using static or dynamic RAM. However, the devices 14 and 20 can also be implemented using a floppy disk and disk drive, a writable optical disk and disk drive, a hard drive, flash memory or the like. Also, it should be appreciated that the devices 14 and 20 can be either distinct portions of a single memory or physically distinct memories.

[0020] Further, it should be appreciated that the links 15 and 17 connecting the devices 14 and 20 and the processor 12 can be a wired or wireless link to a network (not shown). The network can be a local area network, a wide area network, an intranet, the Internet or any other distributed processing and storage network. In this case, the electronic document 16 is pulled from a physically remote memory device 14 through link 15 for processing in the processor 12 according to the method outlined below. In this case, the electronic document 16 can also be stored locally in a portion of some other memory device of the system 10 (not shown).

[0021] The method of this invention identifies two kinds of target documents 22 for each source document 16. The two types of target documents are: 1) target documents that are specifically related to annotated passages; and 2) documents that are generally related to the overall source document. Once a relationship is established between the source document and the target documents 22, the target documents may be displayed by clicking on selectable links in the displayed document 16.

[0022] References to the two types of target documents 22 are shown in Fig. 2. A target document 22 related to the specific passage 32 in the source document 16 is identified by a margin representation 34 placed in the margin of the source document 16 near the related passage 32. As shown in Fig. 3, a target document 22 that is related to the source document 16 as a whole is annotated and shown as an end note 36 to the source document. The end note 36 includes the type, the title and summary information.
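
The two kinds of reference (margin representation and end note) could be held in a single record type, as in the minimal sketch below. The class name `SuggestedLink`, its field names and the sample values are hypothetical; the description only states that an end note carries the type, the title and summary information, and that margin links sit near the related passage.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class SuggestedLink:
    target_id: str                    # identifier of a target document 22
    doc_type: str                     # could be rendered as an icon
    title: str
    summary: str = ""                 # summary information for end notes
    passage_id: Optional[int] = None  # None: relates to the whole document

    @property
    def placement(self) -> str:
        # Passage-level links go in the margin, document-level links
        # are shown as end notes.
        return "end_note" if self.passage_id is None else "margin"

margin_link = SuggestedLink("doc-17", "article", "Relevance feedback", passage_id=3)
end_note = SuggestedLink("doc-42", "report", "Hypertext survey", summary="Overview text")
```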

[0023] Fig. 4 is a flowchart outlining a control routine for one embodiment of the method of this invention. Beginning in step S100, the control routine continues to step S105. In step S105, the control routine determines if the user has made any annotations. If not, control loops back to step S105. If so, control continues to step S110. In step S110, the control routine determines the annotation of the source document made by the user. Next, in step S120, the control routine analyzes the text of the source document and the annotation to determine the passage being annotated. A passage may include a paragraph marked with a margin bar, an underlined sentence or phrase, or the context of one or more circled terms. Then, in step S130, the control routine generates a query from the passage. The query includes content-bearing terms from the identified passage that are weighted to give importance to any circled words. Next, in step S140, the control routine searches the target documents using the query to identify documents that are related to the passage. Then, at step S150, the search results are clustered. Clustering is preferably performed in a manner similar to that described in "Reexamining the Cluster Hypothesis: Scatter/Gather on Retrieval Results", M.A. Hearst et al., Proceedings of ACM SIGIR '96, August 1996, Zurich, Switzerland, incorporated herein by reference in its entirety.
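
Steps S130 and S140 might look roughly like the sketch below, which builds a weighted term query from an annotated passage and ranks target documents by cosine similarity. The stopword list, the `boost` factor for circled words, the cosine ranking and all function names are assumptions chosen for illustration; the description does not specify a particular weighting or retrieval model.

```python
import math
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption, not from the description).
STOPWORDS = {"the", "a", "an", "of", "to", "in", "and", "is", "that", "for", "on"}

def tokenize(text):
    """Keep lowercase alphabetic terms that are not stopwords."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]

def build_query(passage, circled=(), boost=3.0):
    """Step S130: weight content-bearing terms, boosting circled words."""
    circled = {w.lower() for w in circled}
    counts = Counter(tokenize(passage))
    return {t: c * (boost if t in circled else 1.0) for t, c in counts.items()}

def cosine(query, doc_text):
    """Cosine similarity between the weighted query and a document."""
    doc = Counter(tokenize(doc_text))
    dot = sum(w * doc[t] for t, w in query.items())
    q_norm = math.sqrt(sum(w * w for w in query.values()))
    d_norm = math.sqrt(sum(c * c for c in doc.values()))
    return dot / (q_norm * d_norm) if q_norm and d_norm else 0.0

def search(query, targets):
    """Step S140: rank target documents by similarity to the query."""
    scored = [(cosine(query, text), doc_id) for doc_id, text in targets.items()]
    return sorted(scored, reverse=True)

targets = {
    "d1": "Relevance feedback improves retrieval of documents.",
    "d2": "Cooking pasta requires boiling water.",
}
query = build_query("Relevance feedback retrieves similar documents", circled=["feedback"])
ranked = search(query, targets)
```

In this toy run, the document sharing terms with the passage ranks first and the unrelated one scores zero.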

[0024] Next, in step S160, the control routine selects a typical document from each cluster. These documents are further filtered by a user-specified similarity threshold in step S170. Then, in step S180, the remaining documents are identified by displaying links to those documents in the margin of the source document adjacent to the passage from which the query was generated. Each selectable link may be an icon representing a type of the selected and filtered target document and a short title.
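
The cluster, select and filter sequence of steps S150 through S170 can be sketched as follows. This is not the Scatter/Gather method cited in the description: a simple greedy single-pass clustering over term sets is swapped in to keep the example short, and taking the highest-scoring member as the "typical" document of a cluster is likewise an assumption. All names and thresholds are hypothetical.

```python
def jaccard(a, b):
    """Similarity between two term sets."""
    return len(a & b) / len(a | b) if a | b else 0.0

def cluster(results, join_threshold=0.3):
    """Step S150 (simplified): greedy single-pass clustering.

    `results` is a list of (doc_id, query_score, term_set), best first.
    """
    clusters = []
    for item in results:
        for members in clusters:
            # Join the first cluster whose seed document is similar enough.
            if jaccard(item[2], members[0][2]) >= join_threshold:
                members.append(item)
                break
        else:
            clusters.append([item])
    return clusters

def select_links(results, min_score):
    """Steps S160-S170: one representative per cluster, then filter."""
    representatives = [max(c, key=lambda item: item[1]) for c in cluster(results)]
    return [doc_id for doc_id, score, _ in representatives if score >= min_score]

results = [
    ("d1", 0.9, {"query", "link", "hypertext"}),
    ("d2", 0.8, {"query", "link", "anchor"}),
    ("d3", 0.4, {"summary", "sentence"}),
    ("d4", 0.1, {"pasta", "water"}),
]
links = select_links(results, min_score=0.3)
```

Here `d2` is absorbed into the cluster seeded by `d1`, and `d4` is dropped by the similarity threshold, so only two links would be shown in the margin.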

[0025] Next, in step S190, the control routine determines if a user has selected a selectable link in the current source document. If, in step S190, a user has selected a selectable link, the control routine proceeds to step S200. In step S200, the target document is displayed as the new current source document. Control then continues back to step S105, where it waits for another annotation to be made. Alternatively, if in step S190 no selectable link is selected, then control jumps directly back to step S105. The control routine continues until the user has closed all open source documents 16 displayed on the display 18.
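
The overall control flow of Fig. 4 can be approximated by a small event loop. The sketch below replays a pre-recorded list of events instead of waiting on real user input, and the event tuples, the `suggest` callback and the function names are assumptions made for the example rather than details taken from the flowchart.

```python
def run_reader(events, suggest, first_document):
    """Drive the S105-S200 loop over a pre-recorded list of events.

    Each event is ("annotate", passage), ("select", link) or ("close",).
    `suggest(document, passage)` stands in for steps S110-S170.
    """
    current = first_document
    margin_links = {}  # document -> links displayed so far (S180)
    for event in events:
        if event[0] == "annotate":                                # S105
            links = suggest(current, event[1])                    # S110-S170
            margin_links.setdefault(current, []).extend(links)    # S180
        elif event[0] == "select":                                # S190
            if event[1] in margin_links.get(current, []):
                current = event[1]                                # S200
        elif event[0] == "close":
            break
    return current, margin_links

def fake_suggest(document, passage):
    # Placeholder: a real system would query the target documents here.
    return [f"{document}-related-{passage}"]

events = [
    ("annotate", "p1"),
    ("select", "doc-A-related-p1"),
    ("annotate", "p7"),
    ("close",),
]
final_doc, shown = run_reader(events, fake_suggest, "doc-A")
```

Links already generated for a document stay in `margin_links`, which mirrors the statement that links are maintained once identified.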

[0026] To compute end notes, the flowchart of Fig. 4 can be used with slight modifications. The control routine proceeds identically as described for the creation of margin notes from step S100 through step S120. However, at step S130 a weighted-sum query is generated. In step S130, terms that are explicitly identified by the reader and terms identified by standard relevance feedback techniques are used to construct the weighted-sum queries. The identified terms are assigned weights based upon the annotations made to the document. For instance, words that have been expressly selected by the user are weighted the highest and words that occur in selected paragraphs are weighted higher than the remaining terms of the source document.
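
A minimal sketch of the three-tier weighting for the end-note query is given below. Only the ordering of the weights (selected words highest, then words in annotated paragraphs, then everything else) comes from the description; the numeric values, the function name `weighted_sum_query` and the sample text are assumptions, and the selection of additional terms by relevance feedback is omitted.

```python
import re
from collections import Counter

# Hypothetical weights; only their ordering follows the description.
W_SELECTED, W_IN_ANNOTATED, W_OTHER = 3.0, 2.0, 1.0

def weighted_sum_query(paragraphs, annotated_paragraphs, selected_words):
    """Build a term -> weight map for the whole source document."""
    selected = {w.lower() for w in selected_words}
    query = Counter()
    for index, text in enumerate(paragraphs):
        for term in re.findall(r"[a-z]+", text.lower()):
            if term in selected:
                weight = W_SELECTED
            elif index in annotated_paragraphs:
                weight = W_IN_ANNOTATED
            else:
                weight = W_OTHER
            query[term] += weight
    return dict(query)

paragraphs = ["links support reading", "queries retrieve documents"]
query = weighted_sum_query(paragraphs, annotated_paragraphs={1}, selected_words=["queries"])
```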

[0027] Documents that have been identified as related to the document using the weighted-sum query generated in step S130 are processed in a manner similar to the remaining steps S140 through S200 with the exception that the link is displayed as an end note in step S180 rather than as a margin note.

[0028] It should be understood that either or both of these control routines may be running in the background of a document reading system of the invention.

[0029] Optionally, the system and method of this invention may derive summaries from documents through an automatic text summarization process in a manner similar to that described in "A Trainable Document Summarizer", J. Kupiec et al., Proceedings of SIGIR '95, July 1995, Pittsburgh, PA, ACM Press, incorporated herein by reference in its entirety. The summaries are then displayed as end notes.

[0030] It is to be understood that the term annotation as used herein is intended to include text, digital ink, audio, video or any other input associated with a document. It is also to be understood that the term document is intended to include text, video, audio and any other media and any combination of media. Further, it is to be understood that the term text is intended to include text, digital ink, audio, video or any other content of a document, including the document's structure.



Claims 

1. A method for displaying, in a display of a first document, at least one link to another document, each other document being related to the first document, the method comprising:

identifying at least one user annotated segment of the first document;

identifying at least one second document that is related to the at least one annotated segment of the first document; and

displaying in the first document a selectable link for each second document.

2. The method of claim 1, wherein the selectable link is displayed as an end note to the first document.

3. The method of claim 1 or claim 2, the step of identifying the at least one second document comprising identifying at least one portion of the at least one second document as related to the at least one annotated segment, the selectable link referencing the identified at least one portion, the selectable link being displayed in the margin adjacent to the at least one annotated segment and the step of identifying being in response to the annotation of the at least one segment of the first document.

4. The method of any one of claims 1 to 3, wherein the step of identifying the at least one second document comprises determining the relatedness based upon user identified terms and terms identified using relevance feedback techniques.

5. The method of any one of claims 1 to 4, further comprising the steps of:

determining if the selectable link has been selected; and

displaying the identified at least one second document in response to the selection of the selectable link.

6. An electronic document system for suggesting in a display of a first document at least one second document that is related to the first document, the system comprising:

a processor that identifies at least one user annotated segment of the first document and that identifies at least one second document as related to the annotated segment of the first document; and

a display that displays a selectable link that references the identified at least one second document in a display of the first document.

7. The system of claim 6, wherein the selectable link is displayed as an end note to the first document.

8. The system of any one of claims 6 and 7, wherein the processor identifies the at least one second document based upon user identified terms and terms identified based upon relevance feedback techniques.

9. The system of any one of claims 6 to 8, further comprising a user input interface, wherein the processor is responsive to the annotation of a segment of the first document by the user to identify the at least one second document.

10. The system of any one of claims 6 to 9, further comprising a user interface, wherein the display is responsive to the selection of the selectable link by the user to display the identified at least one second document.
